Abstract.

A four-state phase diagram for protein folding as a function of temperature and solvent
quality is derived from an improved 2-d lattice model that takes into account the temperature
dependence of the hydrophobic effect. The phase diagram exhibits native, globule and two
coil-type regions. In agreement with experiment, the model reproduces the phase transitions
indicative of both warm and cold denaturation. Finally, it predicts transitions between the
two coil states and a critical point.



Understanding the physical mechanism underlying protein folding remains one of the main
open problems of contemporary theoretical biophysics. The interplay between the protein-protein
and protein-solvent interactions that drive the folding of the polypeptide may be partly
investigated using fully atomistic representations. Computer simulations at this level of detail
have been shown, for instance, to provide crucial information about the stability of proteins
around their native structure. Such calculations are, however, very time consuming and not
appropriate for characterizing the large conformational space of multimeric chains, which is a
crucial step toward understanding the folding problem [1].

This has led to the emergence of alternative approaches, such as the use of simpler
coarse-grained models. Among these, the lattice model is probably the most popular and
efficient one, as it allows a wide sampling of the conformational space of a given polypeptide
chain [2]. Accordingly, the 16-mer placed on a two-dimensional lattice has often been used to
this end [3-5]. Such a chain is long enough to capture the fundamental mechanisms of protein
folding and short enough to allow the calculation of the partition function by full enumeration
in reasonable computer time.
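
As an illustration of what full enumeration means here, the following is a minimal Python sketch, not the code used for this work. It enumerates self-avoiding walks on the square lattice and keeps one representative per rotation/reflection class. The function name `enumerate_conformations` and the symmetry convention (first bond along +x, first turn toward +y) are assumptions made for this example.

```python
def enumerate_conformations(n_monomers):
    """Yield self-avoiding walks with n_monomers sites on the square lattice.

    Only one walk per rotation/reflection class is produced: the first bond
    points along +x and the first turn, if any, points toward +y.
    """
    moves = ((1, 0), (0, 1), (-1, 0), (0, -1))

    def extend(path, occupied, turned):
        if len(path) == n_monomers:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in moves:
            if not turned and dy < 0:
                continue  # skip the mirror image of an "upward" first turn
            site = (x + dx, y + dy)
            if site in occupied:
                continue  # self-avoidance
            path.append(site)
            occupied.add(site)
            yield from extend(path, occupied, turned or dy != 0)
            path.pop()
            occupied.remove(site)

    if n_monomers == 1:
        yield ((0, 0),)
        return
    start = [(0, 0), (1, 0)]
    yield from extend(start, set(start), False)


if __name__ == "__main__":
    for n in (4, 8, 12):
        print(n, sum(1 for _ in enumerate_conformations(n)))
```

For 16 monomers the count should come out as 802075, the total number of structures quoted in this paper, although a pure-Python run takes noticeably longer than for the short chains printed above.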

Over a decade ago, Dinner et al. [3] used such a model to derive the three-state phase
diagrams of 16-mers for different chain sequences as a function of temperature and average
attraction between monomers. Coil, globule and native states were all obtained, but the model
failed to reproduce the well-known cold denaturation. This transition, from the native to the
coil state upon lowering the temperature, consists in the loss of the order of the chain [6].

It was indeed later shown that the accuracy of the potential describing the interactions
with the solvent is crucial [7]. We have recently proposed a refinement of the coupling model
that explicitly includes a temperature-dependent, so-called hydrophobic effect in a solvation
free energy contribution [8]. This model, considering the same 16-mer chain, predicts the
existence of the coil and native states and the warm and cold denaturation transitions, but
produces no globule state. Despite this last shortcoming, the coupling models were shown
to be consistent with all-atom molecular dynamics simulations of a short peptide solvated in
water [9]. In the recent literature, other models dealing with cold denaturation have also
been proposed [10].

In this paper, we extend previous calculations and propose a more comprehensive model
of the hydrophobic effect that reproduces a four-state phase diagram with both the warm
and cold denaturation transitions. In the model, all the links between two adjacent nodes
of the lattice are taken into account (see an example in fig. 1).

Fig. 1 - One conformation of a 16-monomer chain (filled circles) on a two-dimensional lattice. The
thick solid lines represent the covalent bonds and the springs the intrachain contacts. The solvent
sites are depicted as squares, each of which is divided into four solvent cells (triangles).
Solvent-solvent interactions involve two adjacent solvent cells (clear triangles), whereas a
solvent-monomer bond involves a monomer and a nearest solvent cell (grey triangles).

The effective Hamiltonian of conformation m is given by:

$$H_{\mathrm{eff}}^{(m)}(T) = E_{\mathrm{intr}}^{(m)} + F_{\mathrm{solv}}^{(m)}(T) \qquad (1)$$

The intrachain interaction energy for each conformation m is described as in Dinner et 
al. [3]: 

$$E_{\mathrm{intr}}^{(m)} = \sum_{i>j}^{N} B_{ij}\, \Delta_{ij}^{(m)} \qquad (2)$$

where $B_{ij}$ is the specific interaction between residue sites i and j, and $\Delta_{ij}^{(m)}$
equals 1 if i and j are in contact in conformation m and 0 otherwise. Monomer-monomer
interactions are real numbers selected randomly from a normalized probability density (a
Gaussian distribution) with a standard deviation of 2. One single conformation, noted Nat, is
selected at random among the maximally compact structures and considered as the native
conformation of the sequence. Nat has 9 intrachain contacts. In the spirit of the Go-model [11],
the corresponding interactions are given the 9 smallest values of $B_{ij}$.
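
The contact sum of eq. (2) can be sketched in a few lines of Python. This is only an illustrative sketch: the helper names `contact_pairs` and `intrachain_energy`, the representation of a conformation as a list of lattice coordinates, and the nested-list layout of the coupling matrix `b` are assumptions of this example, not part of the model definition.

```python
def contact_pairs(path):
    """Return pairs (i, j), i > j, of non-bonded monomers on adjacent lattice sites.

    `path` is a sequence of (x, y) lattice coordinates, one per monomer,
    listed in chain order (assumed representation).
    """
    index_of = {site: k for k, site in enumerate(path)}
    pairs = []
    for i, (x, y) in enumerate(path):
        for dx, dy in ((1, 0), (0, 1), (-1, 0), (0, -1)):
            j = index_of.get((x + dx, y + dy))
            # i - j > 1 excludes covalently bonded neighbours and double counting
            if j is not None and i - j > 1:
                pairs.append((i, j))
    return pairs


def intrachain_energy(path, b):
    """Sum b[i][j] over all intrachain contacts, as in eq. (2)."""
    return sum(b[i][j] for i, j in contact_pairs(path))


if __name__ == "__main__":
    # A 4x4 "snake" fold: one maximally compact conformation of a 16-mer.
    snake = [(x if y % 2 == 0 else 3 - x, y) for y in range(4) for x in range(4)]
    print(len(contact_pairs(snake)))
```

A maximally compact 16-mer fills a 4x4 square, which has 24 nearest-neighbour links; 15 of them are covalent bonds, leaving the 9 contacts quoted above.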

The free energy of solvation for each conformation may be written as a sum of two
contributions:

$$F_{\mathrm{solv}}^{(m)}(T) = \sum_{i=1}^{N} n_i^{(m)} f_i(T) + n_s^{(m)} f_s(T) \qquad (3)$$

where $n_i^{(m)}$ and $n_s^{(m)}$ are respectively the number of solvent sites surrounding residue i
and the total number of solvent-solvent contacts, $f_i(T)$ is the specific free energy of a solvent
cell in interaction with residue i, and $f_s(T)$ that of a neat solvent cell.
Taking the extended structures (without any intrachain contacts) as the free energy
reference, the effective Hamiltonian may be rewritten as a sum of effective couplings between
monomers (see the example of fig. 2):



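$$H_{\mathrm{eff}}^{(m)}(T) = \sum_{i>j}^{N} B_{ij}^{\mathrm{eff}}(T)\, \Delta_{ij}^{(m)}$$

with

$$B_{ij}^{\mathrm{eff}}(T) = B_{ij} - f_i(T) - f_j(T) + 2 f_s(T) \qquad (4)$$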



Fig. 2 - Structures A and B differ only by the contact between monomers 5 and 16. The effective
coupling between these monomers is simply the effective Hamiltonian difference between conformations
A and B, calculated by counting the lattice links:
$B_{5,16}^{\mathrm{eff}} = B_{5,16} - f_5(T) - f_{16}(T) + 2 f_s(B_s, T)$.



Recently, Silverstein et al. [12] gave a description of the hydrophobic effect in terms of two
energy spectra that best fit their simulation data. These results exhibit a narrow spectrum of
low degeneracy for neat water and an extended spectrum of high degeneracy for an aqueous
solution with a nonpolar solute. Here, this physical picture is reduced further.
The energy spectrum of the solvent in interaction with monomer i consists of $N_s$ energy values
$B_i^{(j)}$ $(j = 1, \dots, N_s)$ selected from a Gaussian distribution with standard deviation
$\sigma$, while the energy spectrum of the neat solvent is given by a unique level of energy $B_s$,
$N_s^{\alpha}$-fold degenerate. Small values of $B_s$ model a bad solvent and large values a good
solvent. Extending our previous model [8], we introduce here an extra parameter $\alpha$,
representing the degeneracy ratio between the bulk and the first-shell solvent cells. As the total
degeneracy of the latter is higher than that of the former [12], one has $\alpha < 1$; and since
these degeneracies are related to the number of solvent configurations, $N_s$ is a large number.

Accordingly, the free energies associated with the neat solvent and that of solvation of 
each monomer i are respectively given by : 



$$f_s(B_s, T) = B_s - \alpha T \ln N_s \qquad (5)$$

$$f_i(T) = -T \ln Z_i(T) \qquad (6)$$



where $Z_i(T)$ is the partition function of the solvent around monomer i. For large values of
$N_s$, it may be written using a continuous formalism as:

$$Z_i(T) = N_s \int_{-\infty}^{+\infty} n(B_i)\, \exp\left(-\frac{B_i}{T}\right) dB_i \qquad (7)$$



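where $n(B_i)$ is the normalized Gaussian distribution truncated at $B_i^{\min}$, specific
to each residue:

$$n(B_i) = \begin{cases} 0 & \text{if } B_i < B_i^{\min} \\[4pt] \dfrac{\exp\left(-B_i^2 / 2\sigma^2\right)}{\sigma \sqrt{\pi/2}\; \mathrm{erfc}\left(B_i^{\min} / \sigma\sqrt{2}\right)} & \text{if } B_i > B_i^{\min} \end{cases} \qquad (8)$$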
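Equation (6) may therefore be rewritten as:

$$f_i(T) = -T \ln N_s - \frac{\sigma^2}{2T} - T \ln\left[ \frac{\mathrm{erfc}\left( \dfrac{B_i^{\min}}{\sigma\sqrt{2}} + \dfrac{\sigma}{T\sqrt{2}} \right)}{\mathrm{erfc}\left( \dfrac{B_i^{\min}}{\sigma\sqrt{2}} \right)} \right] \qquad (9)$$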
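
The free energies of eqs. (5) and (6) and the effective coupling of eq. (4) can be evaluated with the Python standard library alone. The sketch below is illustrative only. The closed form used in `f_solvation` is an assumption of this example: it is what one obtains by integrating a zero-mean Gaussian of standard deviation `sigma`, truncated below `b_min`, against the Boltzmann factor. All function and argument names are hypothetical.

```python
from math import erfc, log, sqrt


def f_neat_solvent(T, b_s, alpha, n_s):
    """Free energy of a neat solvent cell, eq. (5)."""
    return b_s - alpha * T * log(n_s)


def f_solvation(T, b_min, sigma, n_s):
    """Free energy -T ln Z_i of a solvent cell next to monomer i, eq. (6).

    Assumes Z_i = N_s * integral of a zero-mean Gaussian (std sigma),
    truncated below b_min, times exp(-B/T); the integral is done analytically.
    """
    ratio = erfc((b_min / sigma + sigma / T) / sqrt(2.0)) / erfc(b_min / (sigma * sqrt(2.0)))
    return -T * log(n_s) - sigma ** 2 / (2.0 * T) - T * log(ratio)


def effective_coupling(b_ij, f_i, f_j, f_s):
    """Effective monomer-monomer coupling, eq. (4)."""
    return b_ij - f_i - f_j + 2.0 * f_s


if __name__ == "__main__":
    # Parameter values quoted in the paper; b_min, b_s and b_ij are arbitrary here.
    sigma, alpha, n_s = 2.0, 0.9, 1.0e5
    for T in (0.5, 1.0, 2.0, 4.0):
        f_i = f_solvation(T, b_min=-8.0, sigma=sigma, n_s=n_s)
        f_s = f_neat_solvent(T, b_s=-5.0, alpha=alpha, n_s=n_s)
        print(T, effective_coupling(-1.0, f_i, f_i, f_s))
```

Printing the coupling at several temperatures makes its temperature dependence explicit, which is the ingredient that the temperature-independent model lacks.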



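The probability density that the smallest value of a set of $N_s$ values, each chosen at random
with a probability density $g$, equals $B^{\min}$ is:

$$g_{N_s}(B^{\min}) = N_s\, g(B^{\min})\, P(x > B^{\min})^{N_s - 1} \qquad (10)$$

where $g(B^{\min})$ is the probability density of selecting $B^{\min}$ and $P(x > B^{\min})$ the
probability of drawing a value x larger than $B^{\min}$. Thus, for each residue i, $B_i^{\min}$ is
selected from the probability density:

$$g_{N_s}(B_i^{\min}) = \frac{N_s}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(B_i^{\min})^2}{2\sigma^2}\right) \left[\frac{1}{2}\, \mathrm{erfc}\left(\frac{B_i^{\min}}{\sigma\sqrt{2}}\right)\right]^{N_s - 1} \qquad (11)$$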
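
One way to draw the smallest of $N_s$ Gaussian values without generating all of them is inverse-transform sampling on the survival function $P(\min > b) = [1 - F(b)]^{N_s}$, where $F$ is the Gaussian cumulative distribution. The sketch below is a hedged illustration, not the procedure used in this work; it assumes a zero-mean Gaussian, and the function names are hypothetical. It relies on `statistics.NormalDist.inv_cdf` from the Python standard library (Python 3.8 or later).

```python
import random
from math import expm1, log
from statistics import NormalDist


def b_min_from_uniform(u, sigma, n_s):
    """Map a uniform number u in (0, 1) to the minimum of n_s Gaussian draws.

    Solves [1 - F(b)]**n_s = u for b, i.e. F(b) = 1 - u**(1/n_s).
    expm1 keeps precision when u**(1/n_s) is very close to 1.
    """
    p = -expm1(log(u) / n_s)
    return NormalDist(mu=0.0, sigma=sigma).inv_cdf(p)


def sample_b_min(rng, sigma, n_s):
    """Draw one minimum; rng is a random.Random instance."""
    u = rng.random()
    while u == 0.0:  # random() may return exactly 0, which log() cannot take
        u = rng.random()
    return b_min_from_uniform(u, sigma, n_s)


if __name__ == "__main__":
    rng = random.Random(1)
    # One truncation value per residue of a 16-mer, with sigma = 2 and N_s = 1e5.
    print([round(sample_b_min(rng, 2.0, 10 ** 5), 2) for _ in range(16)])
```

Each call costs one uniform draw and one quantile evaluation, independent of $N_s$, which matters when $N_s$ is as large as $10^5$.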

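The state of the chain under each set of conditions is determined from statistical
equilibrium averages. For an observable $X^{(m)}$, the average over peptide structures may be
defined as:

$$\langle X(B_s, T) \rangle = \sum_{m=1}^{n_{\mathrm{tot}}} X^{(m)} P_{\mathrm{eq}}^{(m)}(B_s, T) \qquad (12)$$

with

$$P_{\mathrm{eq}}^{(m)}(B_s, T) = \frac{\exp\left(-H_{\mathrm{eff}}^{(m)}(B_s, T)/T\right)}{\sum_{m'=1}^{n_{\mathrm{tot}}} \exp\left(-H_{\mathrm{eff}}^{(m')}(B_s, T)/T\right)} \qquad (13)$$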

This expression allows one to estimate the chain entropy, $S_{\mathrm{ch}}(B_s, T) = -\langle \ln P_{\mathrm{eq}} \rangle$;
the compactness of the peptide, defined as the average of
$N_c^{(m)} = \frac{1}{9} \sum_{i>j} \Delta_{ij}^{(m)}$, where $\sum_{i>j} \Delta_{ij}^{(m)}$ is
the number of intrachain contacts of structure m; and the order of the peptide, defined as
the average of $Q^{(m)}$, the pairwise contact overlap of the structures with the native
conformation, $Q^{(m)} = \frac{1}{9} \sum_{i>j} \Delta_{ij}^{(m)} \Delta_{ij}^{(\mathrm{Nat})}$. The number of
contacts of the maximally compact structures (i.e. 9 for the specific chain length studied here)
appears in the two averages above in order to normalize them to 1.
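
The equilibrium averages can be sketched as follows, assuming the effective Hamiltonian of every enumerated structure is already available as a list of numbers. This is a minimal illustration with hypothetical function names; subtracting the lowest energy before exponentiating is a standard numerical precaution added here, not something stated in the text.

```python
from math import exp, log


def boltzmann_weights(h_eff, T):
    """Equilibrium probabilities of eq. (13) for a list of effective energies."""
    h_min = min(h_eff)  # shift energies to avoid overflow at low temperature
    weights = [exp(-(h - h_min) / T) for h in h_eff]
    z = sum(weights)
    return [w / z for w in weights]


def average(values, probs):
    """Thermal average of an observable, eq. (12)."""
    return sum(v * p for v, p in zip(values, probs))


def chain_entropy(probs):
    """S_ch = -<ln P_eq>; structures with zero probability contribute nothing."""
    return -sum(p * log(p) for p in probs if p > 0.0)


def compactness(contacts, n_max=9):
    """N_c of one structure: number of contacts divided by the maximum (9 for a 16-mer)."""
    return len(contacts) / n_max


def overlap(contacts, native_contacts, n_max=9):
    """Q of one structure: fraction of native contacts it shares with Nat."""
    return len(set(contacts) & set(native_contacts)) / n_max


if __name__ == "__main__":
    probs = boltzmann_weights([-3.0, -1.0, 0.0], T=1.0)
    print(probs, chain_entropy(probs))
```

With these helpers, the averages of $Q$ and $N_c$ follow from calling `average` on the per-structure values of `overlap` and `compactness`.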

For a 16-mer chain model on a two-dimensional lattice, the total number of structures is
$n_{\mathrm{tot}} = 802075$, among which $n_{\mathrm{ext}} = 116579$ have zero contacts. For given
values of $B_s$ and T, the state point in the phase diagram is determined by $\langle Q \rangle$,
$\langle N_c \rangle$ and $S_{\mathrm{ch}}$. When $\langle Q \rangle > 0.66$, the peptide is considered
to be in the native phase. When $\langle Q \rangle < 0.66$ and $\langle N_c \rangle > 0.66$, only
some compact structures are relevant, and the chain is in the so-called globule state. When
$\langle N_c \rangle < 0.66$ and $S_{\mathrm{ch}} = \ln n_{\mathrm{ext}}$, the peptide is mainly in the
extended conformations and the phase is coil type II. Last, when $\langle N_c \rangle < 0.66$ and
$S_{\mathrm{ch}} > \ln n_{\mathrm{ext}}$, almost all chain structures have a nonzero probability
of occurring. This state is referred to as coil type I.
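
The classification rules above translate directly into a small decision function. The sketch below is illustrative; the function name and the tolerance `tol`, used to decide numerically whether the entropy "equals" $\ln n_{\mathrm{ext}}$, are assumptions of this example and are not specified in the text.

```python
from math import log

LN_N_EXT = log(116579)  # entropy of the set of zero-contact structures


def classify_state(q, n_c, s_ch, ln_n_ext=LN_N_EXT, threshold=0.66, tol=1e-3):
    """Assign a phase from <Q>, <N_c> and S_ch using the 0.66 thresholds.

    tol is a hypothetical numerical tolerance: entropies within tol of
    ln(n_ext), or below it, are treated as coil type II.
    """
    if q > threshold:
        return "native"
    if n_c > threshold:
        return "globule"
    if s_ch > ln_n_ext + tol:
        return "coil type I"
    return "coil type II"


if __name__ == "__main__":
    print(classify_state(q=0.05, n_c=0.20, s_ch=13.0))
```

Scanning this function over a grid of $(B_s, T)$ values is one simple way to draw the boundaries of a phase diagram from the computed averages.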

By setting the model parameters to $\sigma = 0$ and $\alpha = 1$, the temperature dependence of
the hydrophobic effect is effectively removed. Under such conditions, the corresponding phase
diagram is similar to that determined by Dinner et al. [3]. On the other hand, the two-state
phase diagram where the warm and cold denaturations are present [8] may be obtained by
setting $\sigma = 2$, $\alpha = 0.5$ and $N_s = 10^5$.

To discuss the four-state phase diagram, we set, in the following, the model parameters
to $\sigma = 2$, $\alpha = 0.9$ and $N_s = 10^5$. The results are mildly sequence dependent. We
therefore select a particular sequence and investigate its corresponding phase behavior.

Fig. 3 - Averages of the order parameter $\langle Q \rangle$, the compactness $\langle N_c \rangle$ and the
chain entropy $S_{\mathrm{ch}}$ as functions of $B_s$ and T.

$\langle N_c \rangle$, $\langle Q \rangle$ and $S_{\mathrm{ch}}$ are displayed in fig. 3 as functions of
$B_s$ and T. Several qualitative features are directly observed from the 3-d plots. These may be
classified depending on $B_s$ as follows:

For $B_s < -7.5$, the $\langle Q \rangle$ plots indicate that the peptide is in the native phase at
low temperature and in a denatured phase at high temperature. Depending on the $B_s$ value, the
transitions in $\langle Q \rangle$ and $\langle N_c \rangle$ take place at different temperatures, noted
hereafter $T_w$ and $T_{ex}$ respectively (i.e. $\langle Q \rangle (B_s, T_w(B_s)) = 0.66$ and
$\langle N_c \rangle (B_s, T_{ex}(B_s)) = 0.66$). Since, for $T > T_{ex}$, the chain entropy becomes an
increasing function of temperature (up to $\ln n_{\mathrm{tot}}$), one may identify three regions
corresponding to the following phases: a coil type I phase for temperatures above $T_{ex}$, a
globule phase between $T_w$ and $T_{ex}$, and a native state below $T_w$.

For $-7.5 < B_s < -2.5$, in addition to the states described above, transitions toward
denatured states ($\langle Q \rangle \to 0$) take place at low temperatures. Such transitions,
occurring at temperatures $T_c$ that depend on $B_s$, represent cold denaturation. Below $T_c$,
the chain entropy is constant and equals $\ln n_{\mathrm{ext}}$, which indicates that the low
temperature region corresponds to the coil type II state.

For $B_s$ values above -2.5, $\langle Q \rangle$ and $\langle N_c \rangle$ are very small. The chain
is always in a coil state, regardless of the temperature. These values of $B_s$ are therefore
indicative of good solvation. Different states are however observed, as shown by the
$\langle N_c \rangle$ and $S_{\mathrm{ch}}$ plots (fig. 3). At low temperature, the compactness is
strictly zero and the chain entropy equals $\ln n_{\mathrm{ext}}$, indicating that the peptide is in
the coil type II state. As the temperature increases, so does the entropy until it reaches
$\ln n_{\mathrm{tot}}$, and the chain is in the coil type I phase. To better delineate the
frontier between the coil type I and coil type II regions, we have estimated numerically
$T\, \partial S_{\mathrm{ch}} / \partial T$, the contribution of the chain to the heat capacity of the
system, as a function of temperature. For $B_s < 2.0$, this contribution exhibits a maximum at
$T = T_d$, which is a signature of a first-order disordered-disordered transition between the two
coil phases. For $B_s > 2.0$, the peak is no longer observed.
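
A numerical estimate of the chain contribution to the heat capacity can be sketched with a central finite difference, assuming this contribution is given by the standard thermodynamic relation $C = T\, \partial S_{\mathrm{ch}} / \partial T$. The function names, the step `dT` and the callable `entropy` (which would return $S_{\mathrm{ch}}$ at a given temperature for fixed $B_s$) are hypothetical choices for this illustration.

```python
from math import atan


def chain_heat_capacity(entropy, T, dT=1e-3):
    """Central-difference estimate of T * dS/dT.

    `entropy` is any callable returning the chain entropy at temperature T.
    """
    return T * (entropy(T + dT) - entropy(T - dT)) / (2.0 * dT)


def peak_temperature(entropy, temperatures):
    """Return the grid temperature at which the heat-capacity estimate is largest."""
    return max(temperatures, key=lambda T: chain_heat_capacity(entropy, T))


if __name__ == "__main__":
    # Toy entropy curve with a smooth step around T = 2 (not model output).
    grid = [0.5 + 0.1 * k for k in range(36)]
    print(peak_temperature(lambda T: atan(T - 2.0), grid))
```

On a real entropy curve, the absence of an interior maximum on the temperature grid would correspond to the situation described above for large $B_s$, where no peak is observed.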





Fig. 5 - Phase diagram of the chain as a function of the solvent quality and the temperature. All the
structures are observed in the coil type I region, only the extended ones in the coil type II region,
only the compact ones in the globule region and the sole native conformation in the Nat region.




The previous results are summarized in the phase diagram reported in fig. 5. For the
particular sequence considered here, four states are distinguished. The native, globule and
coil-type I phases coexist at the triple point $(B_s, T) = (-3.4, 1.40)$, and the native, coil-type
I and coil-type II phases coexist at $(B_s, T) = (-2.4, 0.72)$. A critical point is observed at
$(B_s, T)_c = (2.0, 4.0)$. Thus, moving along the $B_s$ and T axes, transitions between Coil Type I
and Coil Type II that do not cross any peak in the heat capacity are allowed. Very small and
smooth variations in $S_{\mathrm{ch}}$, $\langle Q \rangle$ and $\langle N_c \rangle$ occur along such
paths. This confirms that Coil Type I and Coil Type II are two phases of the extended state, which
implies that warm and cold denaturations are, indeed, transitions toward the same extended state.
The existence of a hypothetical supercritical phase for $B_s > 2.0$ or $T > 4.0$ is not clear. The
nature of the set of structures relevant in such a speculative region should be investigated by a
detailed study of the effective Hamiltonian spectra as a function of temperature.

Last, in the simulations performed with $\alpha = 0.5$ [8], the Globule and Coil Type I phases
disappear, leaving only the warm and the cold transitions between Coil Type II and Nat.

In summary, we have shown that the solvation model presented in this paper allows one to
calculate, for the first time, a four-state phase diagram of a peptide chain. One
would need, however, to elucidate the physical meaning of all the model parameters $N_s$, $\sigma$
and $\alpha$ and their relative values for the 20 natural amino acids for a complete understanding
of the mechanism responsible for protein folding. Last, it must be understood that similar
phase diagrams are obtained if the same value of $B^{\min}$ is assigned to every residue. However,
we choose to select one value of $B_i^{\min}$ for each residue in order to model the specificity of
the hydrophobicity of each monomer of the protein.

* * * 

It is a pleasure to acknowledge Mounir Tarek for helpful discussions and critical reading 
of the manuscript. 